Data:

Data we have at month-year-fishing area resolution:

Here is a (different) plot of the mean lengths at age by area and year class.

Here is a map of the state areas:

L vs \(\Delta\) L

The following plot shows change in length vs length. It is a subset of the total data set and only includes cases where sampling was conducted in the same state area for two consecutive months. The red line is the linear regression. It is not great and very noisy.

Linear model:

\(L_{a,y,r}\) is length of age \(a\) (in months) shrimp from the cohort \(c\) caught in area (region) r, and \(m\) is the month of the year (i.e., cycles back to 1 at the end of the year unlike \(a\)). To align \(m\) with the seasonal cycle of growth, \(m=1\) corresponds with March.

\[\begin{aligned} L_{1,c,r} &= \mu_{L1} + \gamma_{r} + \varepsilon_{1,c} \\ L_{a,c,r} &= \alpha_0 + \alpha_1 L_{a-1,c,r} + \alpha_s \sin\left(\tfrac{\pi m}{6}\right) + \varepsilon_{a,c} \\ \hat{L}_{a,c,r} &= L_{a,c,r} + \delta_{a,c,r} \end{aligned}\]

\(\mu_{L1}\) is average size of age 1 shrimp in April, \(\alpha_0\) and \(\alpha_1\) describe the average monthly growth, and \(\alpha_s\) is the magnitude of the seasonality component.

Distributional assumptions are: \[\begin{aligned} \varepsilon_{1,c} &\sim N(0, \sigma_{L1}) \\ \varepsilon_{a,c} &\sim N(0, \sigma_p), a\neq 1\\ \delta_{a,c,r} &\sim N(0, \sigma_o)\\ \gamma_r &\sim N(0, \sigma_r) \end{aligned}\]

Non-linear parameterization

A second parameterization was explored that allows the seasonal component to solely influence the growth rate, and not the asymptotic size. Only the second equation is different than the model above (instead of \(\alpha_0\) and \(\alpha_1\), the two model parameters are \(L_\infty\) and \(k\)):

\[ L_{a,c,r} = L_\infty\left[1-\exp\left(-ke^{\alpha_s\sin\left(\pi m/6\right)} \right) \right] + L_{a-1,c,r}\exp\left(-ke^{\alpha_s\sin\left(\pi m/6\right)}\right) + \varepsilon_{a,c} \]

Fitting:

The model is fit in Stan as a multivariate autoregressive state-space model. For the linear model, the global mean is subtracted from all lengths to improve numerical stability. Each year class is essentially a different “state”, and each area represents a different observation process of the same state. Three chains are run for 5000 iterations, and the final 2500 iterations of each chain are retained. [More chains fewer iterations?] This is probably not sufficient for formal estimation, but seemed good enough for a first pass at major patterns.

Diagnostics

Now we’ll look at posterior predictive intervals for the linear and non-linear models with and without process errors (i.e., fixing \(\varepsilon_{a,c}\) to zero and not estimating \(\delta_p\)). The inset plot is a distribution of recruitment across years with that cohort’s recruitment in red.

Linear model with process error

Linear model without process error

Non-linear model with process error

Non-linear model without process error

Results:

First, here is a table of the estimated parameters. The mixing for the initial size (not just the mean below, but also the parameters for each year and for the fishing area offsets) is poor, so the model should probably get run longer.

Second, here is a plot of residuals versus fitted values (using the median from the posterior as the fitted value) with a smoother. We see that the residuals look alright.

I also thought it would be useful to look at the process errors. We can see that the process model under-estimates growth for age 1 and over-estimates it for age 3. Oddly, \(\alpha_1\) is being estimated very close to its bound at 1. If it were smaller, that would allow growth to slow as shrimp get larger. Strong priors with minimal weight near 1 did not change this, but did slow mixing.

Next steps:

  1. Does anyone have any thoughts on why faster growth during the first year and slower growth especially during the second winter and age 3 are getting shunted into random effects? Is it important to fix? I’m not weighting anything by sample size. Maybe age 3 mean lengths are unreliable and need to be down-weighted? I recall that I tried doing that and the results were wonky for some reason. Maybe the data are just not informative enough for such a data-hungry state-space model?

  2. Compare with condition index (have data), look at possible oceanographic and density-related correlates with growth and condition?